A Hybrid Geospatial Data Clustering Method for Hotspot Analysis

Authors

Mohammad Reza Keyvanpour
Department of Computer Engineering, Alzahra University, Tehran, Iran

Mostafa Javideh
Shamsipoor Technical College, Tehran, Iran

Mohammad Reza Ebrahimi
Islamic Azad University, Qazvin Branch, Qazvin, Iran

Abstract

Traditional statistical methods for analyzing today's large volumes of spatial data carry a high computational burden. To address this deficiency, relatively modern data mining techniques have recently been applied to various spatial analysis tasks, with the aim of autonomous knowledge extraction from high-volume spatial data. Fortunately, geospatial data is considered well suited to data mining techniques. The main purpose of this paper is to present a hybrid geospatial data clustering mechanism that yields a high-performance hotspot analysis method. The method works on 2- or 3-dimensional geographic coordinates of various natural and non-natural phenomena. It relies on the systematic cooperation of two popular clustering algorithms: agglomerative nesting (AGNES), a hierarchical clustering method, and k-means, a partitional clustering method. The hybrid method is claimed to inherit the low time complexity of k-means and the relative independence from user knowledge of AGNES. The proposed method is therefore expected to be faster than AGNES and more accurate than k-means. Finally, the method was evaluated against two popular clustering quality criteria: the first is adapted from Fisher's separability criterion, and the second is the popular minimum total distance measure. The evaluation results show that the proposed hybrid method achieves acceptable performance: it has a desirable time complexity and higher cluster quality than its parents (AGNES and k-means). Because real-time processing of hotspots requires an efficient approach with low time complexity, time complexity was taken into account in designing the proposed approach.
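The abstract does not say how AGNES and k-means cooperate, so the following is only a minimal sketch of one plausible arrangement, assumed here for illustration and not taken from the paper: run a centroid-linkage agglomerative pass on a small random sample to obtain the number of clusters and initial centers without asking the user for k, then refine those centers on the full data set with k-means. All function names and the parameters `sample_size` and `merge_threshold` are hypothetical, and `total_distance` is just one reading of the minimum total distance measure (the sum of point-to-center distances).

```python
import math
import random


def centroid(points):
    """Mean of a list of equal-length coordinate tuples (2-D or 3-D)."""
    n = len(points)
    return tuple(sum(axis) / n for axis in zip(*points))


def agnes_seeds(sample, merge_threshold):
    """Agglomerative pass (centroid linkage, an assumption) over a small sample.

    Repeatedly merges the two closest clusters and stops once the closest
    pair is farther apart than merge_threshold. Returns the cluster centroids.
    """
    clusters = [[p] for p in sample]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = math.dist(centroid(clusters[i]), centroid(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        if d > merge_threshold:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [centroid(c) for c in clusters]


def assign(points, centers):
    """Index of the nearest center for every point."""
    return [min(range(len(centers)), key=lambda k: math.dist(p, centers[k]))
            for p in points]


def kmeans(points, centers, max_iter=100):
    """Lloyd's k-means started from the given centers."""
    centers = list(centers)
    for _ in range(max_iter):
        labels = assign(points, centers)
        updated = []
        for k in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            # Keep the old center if a cluster loses all of its points.
            updated.append(centroid(members) if members else centers[k])
        if updated == centers:
            break
        centers = updated
    return assign(points, centers), centers


def total_distance(points, labels, centers):
    """Sum of distances from each point to its assigned center."""
    return sum(math.dist(p, centers[lab]) for p, lab in zip(points, labels))


def hybrid_hotspots(points, sample_size, merge_threshold, seed=0):
    """Hypothetical hybrid: agglomerative seeding on a sample, then k-means."""
    rng = random.Random(seed)
    sample = rng.sample(points, min(sample_size, len(points)))
    return kmeans(points, agnes_seeds(sample, merge_threshold))
```

Calling `hybrid_hotspots(points, sample_size, merge_threshold)` on a list of coordinate tuples returns one label per point together with the final centers. The quadratic agglomerative step only ever sees the sample, which is how a scheme of this kind could keep the overall cost close to that of k-means.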

Similar resources

A Fuzzy C-means Algorithm for Clustering Fuzzy Data and Its Application in Clustering Incomplete Data

The fuzzy c-means clustering algorithm is a useful tool for clustering; but it is convenient only for crisp complete data. In this article, an enhancement of the algorithm is proposed which is suitable for clustering trapezoidal fuzzy data. A linear ranking function is used to define a distance for trapezoidal fuzzy data. Then, as an application, a method based on the proposed algorithm is pres...

A Clustering Based Hybrid System for Mass Spectrometry Data Analysis

Recently, much attention has been given to mass spectrometry (MS) technology based disease classification, diagnosis, and protein-based biomarker identification. As in microarray-based investigation, proteomic data generated by this kind of high-throughput experiment often have a high feature-to-sample ratio. Moreover, biological information and pattern are compounded with data nois...

Clustering-Based Method for Data Envelopment Analysis

Data Envelopment Analysis (DEA) is a powerful performance measurement tool in the economic sector and operations research for assessing the relative efficiency of each decision making unit (DMU). In general, there are two assumptions in DEA. Firstly, the DEA assumes that all DMUs are homogeneous in their environments and secondly, the DEA is a deterministic approach which refers to not allow to noise or err...

A Hybrid Method for Sequence Clustering

The problem of sequence clustering is one of the fundamental research topics. However, most algorithms are dedicated to the case of single-label clustering. In this paper, we propose sequence clustering algorithms which can be applied for finding multi labels with respect to variable-length sequences. In our research, we first map sequences as vectors in the feature space by applying DCT transf...

Crime Hotspot Tracking and Geospatial Analysis in Merseyside, UK

Crime prediction is a topic of significant research across the fields of criminology, data mining, city planning, law enforcement, and political science. Crime patterns exist on a spatial level; these patterns can be grouped geographically by physical location, and analyzed contextually based on the region in which crime occurs. This paper proposes a mechanism to parameterize street-level crime...

Hybrid Method of Logistic Regression and Data Envelopment Analysis for Event Prediction: A Case Study (Stroke Disease)

Predictive analytics is an area of statistics that deals with extracting information from data and using it to predict trends and behavior patterns. Many mathematical models have been developed and used for prediction, and in some cases, they have been found to be very strong and reliable. This paper studies different mathematical and statistical approaches for events prediction. The ...

Journal:
Journal of Computer and Robotics

Volume 3, Issue 1, pages 53-67
